Providing normative information increases intentions to accept a COVID-19 vaccine

Despite the availability of multiple safe vaccines, vaccine hesitancy may present a challenge to successful control of the COVID-19 pandemic. As with many human behaviors, people’s vaccine acceptance may be affected by their beliefs about whether others will accept a vaccine (i.e., descriptive norms). However, information about these descriptive norms may have different effects depending on the actual descriptive norm, people’s baseline beliefs, and the relative importance of conformity, social learning, and free-riding. Here, using a pre-registered, randomized experiment (N = 484,239) embedded in an international survey (23 countries), we show that accurate information about descriptive norms can increase intentions to accept a vaccine for COVID-19. We find mixed evidence that information on descriptive norms impacts mask wearing intentions and no statistically significant evidence that it impacts intentions to physically distance. The effects on vaccination intentions are largely consistent across the 23 included countries, but are concentrated among people who were otherwise uncertain about accepting a vaccine. Providing normative information in vaccine communications partially corrects individuals’ underestimation of how many other people will accept a vaccine. These results suggest that presenting people with information about the widespread and growing acceptance of COVID-19 vaccines helps to increase vaccination intentions.


Supplementary Note 1. Experiment overview
Figure S1 shows the recruitment materials displayed on Facebook. Figure S2 outlines the basic experimental design. Figure S3 shows the various norms presented to participants. These numbers vary over time because they were updated throughout the experiment to reflect the most recent data.

Fig. S3. Treatment Variation
For each behavior ((a) Vaccine, (b) Mask wearing, (c) Physical distancing), we plot the information provided to participants based on the broad and narrow definitions of compliance. The treatments were updated every two weeks as new waves of data were included. The points labeled "country belief" display the weighted average belief in a country of how many people out of 100 practice (or will accept, for vaccines) each behavior.

Supplementary Note 2. Variables of interest
As in our pre-analysis plan, the following variables are used in our analysis:
1. Outcomes
(a) Over the next two weeks, how likely are you to wear a mask when in public? [Always, Almost always, When convenient, Rarely, Never]
• Distancing: How often are you able to stay at least 1 meter away from people not in your household? How important do you think physical distancing is for slowing the spread of COVID-19?
• Vaccine: If a vaccine for COVID-19 becomes available, would you choose to get vaccinated? This will be coded as binary indicators for the possible outcomes, grouping missing outcomes with "Don't know".
(b) Beliefs about norms. These questions will be randomized to be shown before the treatment for some respondents and after treatment for other respondents.
This will allow us to study heterogeneity in baseline beliefs, as well as ensure our randomization does impact beliefs.
• Masks: Out of 100 people in your community, how many do you think do the following when they go out in public? Wear a mask or face covering.
• Distancing: Out of 100 people in your community, how many do you think do the following when they go out in public? Maintain a distance of at least 1 meter from others.
• Vaccine: Out of 100 people in your community, how many do you think would take a COVID-19 vaccine if it were made available?
3. Additional covariates used to check balance
(a) Indicators if respondents received news from the following sources and mediums: online sources, messaging apps, newspapers, television, radio, local health workers, scientists, the World Health Organization, politicians, journalists, and peers.
(b) Indicators if respondents trusted news from the same sources and mediums as above.
(c) Indicators if respondents reported engaging in the following behaviors: wearing a face mask, taking herbal supplements, using homeopathic remedies, getting the flu vaccine, eating garlic, cleaning surfaces, using antibiotics, isolating, washing hands, covering their mouth when they cough, avoiding sick individuals, maintaining a distance from others, avoiding touching their face, and taking caution when opening mail.
(d) Indicators if respondents reported being willing to attend restaurants, parks and beaches, retail shops, schools, performances and sporting events, places of employment, places of worship, and health care facilities.
When used in analysis, we require all covariates to be measured before both treatment and outcome.
Because the survey randomizes the order of these questions, this requirement ensures that the distribution of question order is the same across treated and control groups and removes any imbalance created by differential attrition. Missing values are imputed at their (weighted) mean. Table S1 presents results of a test that the treatment and control shares were equal to 50% as expected. While the final dataset does show some evidence of imbalance that could be caused by differential attrition, the "robust" dataset (described in Supplementary Note 5.1) is well balanced, and the treatment is balanced across the three behaviors about which information could be provided (Table S2). In line with our pre-registered analysis plan, because there is evidence of differential attrition, we report additional analyses throughout this supplement that use respondents given information about other behaviors as an alternative control group.

Table S1. Results of a two-sided test that the treated and control shares equal 50%. The first row uses intent-to-treat on the full set of eligible respondents, the second row uses the final dataset after conditioning on eligibility and completing the survey, and the third row uses the subset of responses in the final dataset that have at least one block between treatment and outcome. The CI columns are 95% confidence intervals.

Table S2. P-values of a two-sided test that each behavior was shown the expected number of times. This reports the results of a joint test that each period share was equal to the expected share. For waves 9-12, each behavior was shown 1/3 of the time, and from wave 12 on the vaccine treatments were shown to 2/3 of respondents and the mask treatments were shown to 1/3 of respondents. This table cannot include the full dataset intent-to-treat analysis because the behavior randomization occurred when the treatment was shown. The CI rows are 95% confidence intervals.
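To make the share test concrete, here is a minimal sketch of a two-sided test that the treated share equals 50%. It assumes an unweighted normal-approximation (z) test, which may differ from the exact procedure used for Table S1. The function name `share_test` and the counts are hypothetical and chosen only for illustration.

```python
import math

def share_test(n_treated, n_total, expected=0.5):
    """Two-sided z-test (normal approximation) that the treated share equals `expected`.

    Returns the observed share, a 95% Wald confidence interval, and the p-value.
    Assumption: unweighted counts; survey weights are ignored in this sketch.
    """
    share = n_treated / n_total
    # Standard error under the null hypothesis, used for the test statistic.
    se_null = math.sqrt(expected * (1 - expected) / n_total)
    z = (share - expected) / se_null
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    # Standard error at the observed share, used for the confidence interval.
    se_hat = math.sqrt(share * (1 - share) / n_total)
    ci = (share - 1.96 * se_hat, share + 1.96 * se_hat)
    return share, ci, p_value

# Hypothetical counts, for illustration only (not taken from the study).
share, ci, p = share_test(n_treated=50_600, n_total=100_000)
print(f"share={share:.4f}, 95% CI=({ci[0]:.4f}, {ci[1]:.4f}), p={p:.4g}")
```

With samples as large as those in this experiment, even small departures from 50% can yield small p-values, which is one reason to compare several sample definitions rather than rely on a single test.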

Supplementary Note 3. Randomization checks
In addition, baseline covariates measured before both treatment and the outcome are balanced across treatment and control groups (Table S3). The covariates are also balanced in the final analysis dataset (Table S4) and within treated users across the three possible treatment behaviors (Table S5). Because of the large number of covariates available in the survey, these tables include only a subset of possible covariates. In Figure S4 we plot ordered p-values for the balance tests described in Tables S3, S4, and S5. In addition to the p-values from the tables, Figure S4 includes p-values for all other available pre-treatment covariates.

Fig. S4. Ordered p-values for the (two-sided) balance tests described in Tables S3, S4, and S5, sorted in ascending order. All available pre-treatment covariates are included, which results in 76 tests. This includes all covariates reported in Table S3 in addition to questions that contain multiple responses. The full text of these questions is described in detail in (1) and summarized in Supplementary Note 2 under additional covariates used to check balance. (a) Balance checks for intent-to-treat sample, (b) balance checks for the final sample, (c) balance checks comparing vaccine and mask treated groups, (d) balance checks comparing vaccine and physical distancing treated groups. There are no corrections applied for multiple comparisons.

Table S3. Pre-treatment covariate means and 95% confidence intervals for all respondents who were eligible for treatment, in both the treatment and control groups, along with the p-value for the two-sided test of the null that the means are equal. For each covariate, only responses where the covariate is not missing and occurs before both treatment and control are included. To account for changes to the sampling frequencies, these p-values are from the coefficient on the intent-to-treat term in a regression of the covariate on treatment, period, and centered interactions between treatment and period. As we do not have weights for all respondents, this is an unweighted regression. There are no corrections for multiple comparisons.

Table S4. Pre-treatment covariate means and 95% confidence intervals for all respondents who were eligible for treatment, completed the entire survey, and received a full survey completion weight, in both the treatment and control groups, along with their standard errors in parentheses and the p-value for the two-sided test of the null that the means are equal. For each covariate, only responses where the covariate is not missing and occurs before both treatment and control are included. To account for changes to the sampling frequencies, these p-values are from the coefficient on the treatment term in a regression of the covariate on treatment, period, and centered interactions between treatment and period. This is a weighted regression using full completion survey weights. There are no corrections for multiple comparisons.

Table S5. Pre-treatment covariate means and 95% confidence intervals for all respondents who were treated, completed the entire survey, and received a full survey completion weight, along with their standard errors in parentheses and the p-value for the two-sided test of the null that the means between treatment groups are equal. For each covariate, only responses where the covariate is not missing and occurs before both treatment and control are included. To account for changes to the sampling frequencies, these p-values are from the coefficient on the treatment behavior terms in a regression of the covariate on treatment behavior, period, and centered interactions between treatment behavior and period. This is a weighted regression using full completion survey weights. There are no corrections for multiple comparisons.

Table S6. Estimates of equation 1 with beliefs about descriptive norms as outcomes. All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.
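One way to read a plot of ordered p-values is to compare them with what a uniform distribution would produce, since p-values are uniformly distributed when the null of balance holds. The sketch below illustrates that comparison. It is an assumption about how such a figure can be interpreted, not a description of how Figure S4 was built. The function name and the p-values are hypothetical, and the uniform benchmark is exact only for independent tests, which balance tests on correlated covariates are not.

```python
def ordered_pvalues(pvals):
    """Sort p-values ascending and pair each with its expected value under a uniform null."""
    m = len(pvals)
    ordered = sorted(pvals)
    # The k-th smallest of m independent Uniform(0, 1) draws has mean k / (m + 1).
    expected = [k / (m + 1) for k in range(1, m + 1)]
    return list(zip(expected, ordered))

# Hypothetical p-values, for illustration only (not taken from the study).
pvals = [0.62, 0.04, 0.91, 0.33, 0.48]
for exp, obs in ordered_pvalues(pvals):
    print(f"expected={exp:.3f}  observed={obs:.3f}")
```

Observed values that fall well below the expected ones across many tests would suggest more imbalance than chance alone produces.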

Supplementary Note 5. Effects on intentions
Figure S6 displays regression coefficients for the primary analysis described in the methods section. Figure S6a uses respondents who receive the information after the outcome is measured as the control group and Figure S6b uses individuals who receive the information treatment for a different behavior as the control group. Table S8 presents results from the distribution regressions of the same analysis. This allows us to understand which response thresholds the treatment induces people to cross. The coefficients indicate that the treatment induces people to report that they will at least probably take the vaccine and that they will definitely take the vaccine.
Similar regressions restricted to those who report at baseline that they don't know if they will take the vaccine are presented in Table S9. Finally, estimates of heterogeneous treatment effects from equation 2 are reported in Tables S11 and S12 and Figure S7.

Table S7. Estimates of equation 1 with intentions as outcomes. All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.

Table S8. Estimates of equation 1 with binary outcomes. The outcome variable for each column is an indicator equal to one if the respondent reported a value higher than the column name. For example, in the column "> Probably not" the outcome Y_i equals one if the respondent answered "Unsure", "Probably", or "Yes, definitely". All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.

Table S9. Estimates of equation 1 with binary outcomes on the sample of respondents who say at baseline that they don't know if they will take a vaccine. The outcome variable for each column is an indicator equal to one if the respondent reported a value higher than the column name. For example, in the column "> Probably not" the outcome Y_i equals one if the respondent answered "Unsure", "Probably", or "Yes, definitely". All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.

Estimates of equation 1 with binary outcomes on the sample of respondents whose baseline belief about how many people in their community will take a vaccine is under the narrow treatment number. The outcome variable for each column is an indicator equal to one if the respondent reported a value higher than the column name. For example, in the column "> Probably not" the outcome Y_i equals one if the respondent answered "Unsure", "Probably", or "Yes, definitely". All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.

The two-sided joint test that the broad and narrow coefficients are equal across groups has a p-value of 0.807, and the two-sided test that the broad (narrow) treatment effects in the Under and Above groups are equal has a p-value of 0.38 (0.31). All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.

Heterogeneous treatment effects based on baseline beliefs about how many people in their community will accept a vaccine. We remove respondents who say they believe 0, 50, or 100 percent of people in their community will accept a vaccine to mitigate measurement error due to a bias towards round numbers. Error bars are 95% confidence intervals. There are n=86,617 responses in this analysis.

The two-sided joint test that the broad and narrow coefficients are equal across groups has a p-value of 0.012, and the two-sided test that the broad (narrow) treatment effects in the Yes and Don't know groups are equal has a p-value of <0.01 (0.05). The two-sided test that the broad (narrow) treatment effects in the Don't know and No groups are equal has a p-value of 0.15 (0.08). All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.

The coefficients reported in these figures are from a regression of vaccine acceptance, measured on a five-point scale, on indicators of the treatment number the individual was shown (if treated), grouped into bins of width 20 percentage points. We also include covariates and their interactions with treatment as in the main analysis. We show the analysis for (a) the full sample receiving the broad treatment (n=299,692), (b) the full sample receiving the narrow treatment (n=298,977), (c) respondents who answered "Don't know" to the baseline vaccine acceptance question and received the broad treatment (n=57,079), and (d) the same group receiving the narrow treatment (n=57,113). Error bars are 95% confidence intervals.
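The binary outcomes used in the distribution regressions can be illustrated with a short sketch. It turns one response on the five-point scale into the set of "greater than" indicators described in the captions. The label of the lowest category ("No, definitely not") is an assumption, since only the other four labels appear in the text, and the function name is hypothetical.

```python
# Five-point scale, ordered from lowest to highest intention.
# Assumption: the lowest label is not given in the text and is a placeholder here.
SCALE = ["No, definitely not", "Probably not", "Unsure", "Probably", "Yes, definitely"]

def threshold_indicators(response):
    """Return one binary outcome per threshold: 1 if the response is strictly above it."""
    rank = SCALE.index(response)
    # The top category has no "greater than" column, so it is excluded from the thresholds.
    return {f"> {level}": int(rank > k) for k, level in enumerate(SCALE[:-1])}

# A respondent answering "Unsure" is above the two lowest thresholds only.
print(threshold_indicators("Unsure"))
```

Running one regression per indicator then shows which thresholds of the scale the treatment moves respondents across, rather than only the average shift.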

Supplementary Note 5.1. Robustness checks.
A concern with survey experiments is that results could reflect researcher demand effects, where participants respond how they think the researchers would want them to respond. While we cannot rule this out completely, we do not believe this is driving our results (2,3). In this section we present results of an analysis that restricts the sample to respondents for whom treatment and outcome are separated by at least one "block" of questions (Figure S12 and Table S13). Table S14 shows heterogeneous treatment effects by baseline vaccine acceptance for this restricted sample. Figure S13 plots the distribution of the number of screens between treated and control. In Figure S13a, we plot the distribution for the entire sample and in Figure S13b we plot the distribution for the subset of those with at least one block between treatment and control.
For this group there are at least three pages between the treatment and outcome.

Estimates of equation 1 on the restricted sample when outcome and treatment are separated by at least one additional block of questions. All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.

Estimates of equation 1 on the restricted sample when outcome and treatment are separated by at least one additional block of questions. The outcome variables in this analysis are binary indicators for whether the outcome was at least a certain response, as in Table S8. All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.

Supplementary Note 5.2. Unweighted estimates.
Here we show the importance of using the survey weights in estimating quantities from the survey (Figure S14a) and the results of the main analyses replicated without using the survey weights (Figure S14b).
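As a small illustration of why weights can matter, the sketch below compares an unweighted and a weighted share on made-up data. The function name, responses, and weights are hypothetical. The survey's actual weighting scheme is not described in this section.

```python
def weighted_mean(values, weights):
    """Weighted average: each value counts in proportion to its survey weight."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical data: 1 = would accept a vaccine, 0 = would not.
responses = [1, 1, 0, 0]
# Hypothetical weights: the last two respondents represent under-sampled groups.
weights = [0.5, 0.5, 1.5, 1.5]

unweighted = sum(responses) / len(responses)
weighted = weighted_mean(responses, weights)
print(f"unweighted share={unweighted:.2f}, weighted share={weighted:.2f}")
```

When respondents who are over-represented in the sample answer differently from those who are under-represented, the two estimates diverge, as they do in this toy example.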

Supplementary Note 6. Norm-intention correlations
In the main text, Figure 1 (inset) shows the association between beliefs about descriptive norms and intentions to accept a COVID-19 vaccine. Figure S15 disaggregates this information by country. As in the main text, this is a purely observational association but is computed on the main experimental sample (i.e., starting in late October).

Fig. S15. People who believe a larger fraction of their community will accept a vaccine are on average more likely to say they will accept a vaccine, and this is true within the 23 included countries. The vertical axis shows the percentage of respondents who replied Yes (green), Don't know (gray), and No (brown) to whether they will accept a COVID-19 vaccine.

Supplementary Note 7. Country-level validation of survey data
At the end of the survey, vaccinations were not readily available to the vast majority of people in the sample, so supply constraints make it difficult to compare intentions to receive a COVID-19 vaccine with actual vaccine take-up. To quantify the correspondence between our survey responses and other measures of corresponding behavior, we run an auxiliary analysis comparing self-reported receipt of a COVID-19 vaccine from the survey with country-level uptake (COVID-19 vaccine data retrieved from https://github.com/owid/covid-19-data/tree/master/public/data/vaccinations). The estimated vaccinated share of adults from the survey measure is highly predictive of the cross-country variation in actual vaccination shares, explaining over 80% of the cross-sectional variation (Table S15).

Table S15. Regression of true vaccination share on estimated share vaccinated based on the survey measure. Adjustment for attenuation bias due to measurement error in the survey results in nearly identical results. All p-values are from a two-sided test that the coefficient is equal to zero, standard errors are in parentheses, and 95% confidence intervals are in square brackets.
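The validation regression can be sketched as a simple least-squares fit of the true vaccination share on the survey-based estimate, with R² measuring the share of cross-country variation explained. This is a minimal unweighted sketch. The function name and the data points are hypothetical and are not the values behind Table S15.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; return (intercept, slope, R^2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot

# Hypothetical country-level shares, for illustration only.
survey_share = [0.05, 0.10, 0.20, 0.35, 0.50]
true_share = [0.03, 0.09, 0.17, 0.30, 0.47]
intercept, slope, r2 = simple_ols(survey_share, true_share)
print(f"intercept={intercept:.3f}, slope={slope:.3f}, R^2={r2:.3f}")
```

A slope near one and a high R² would indicate that the survey measure tracks actual uptake closely across countries.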

Supplementary Note 8. Intention to behavior correlation
A limitation of this experiment is that we measure self-reported vaccination intentions rather than eventual vaccination. When the experiment was first fielded, vaccinations were not available to the public, and this remains true for many of the countries studied. Using the respondents who completed both the baseline and endline survey, we predict endline vaccination status with baseline vaccination intentions; this is plotted in Figure S16. Our vaccination intentions measure is quite predictive of future vaccination status, with over half of those responding "Yes, definitely" having received at least one dose of a vaccine two months later. We also ask for the approximate date on which they received their vaccine and plot the distribution of acceptance over time in Figure S17. Those with stronger intentions to receive a vaccine not only receive the vaccine at higher rates but also do so more quickly.

Fig. S17. Distribution regression of the date that someone received their COVID-19 vaccine by baseline intention group. Those who are unvaccinated at endline are coded as 1,000 so the line plots the share who have received at least one dose over time. There is some mismeasurement, as some respondents reported not having received a vaccine in the baseline survey (April), while saying they received their first dose in March during the endline survey. We plot bootstrapped 95% uniform confidence intervals of the cumulative distribution functions (CDFs) centered around the empirical CDFs. There are n=1,350 respondents in this analysis.
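The coding of unvaccinated respondents as 1,000 can be illustrated with a short sketch of an empirical CDF. Because 1,000 lies beyond any plausible vaccination date, the CDF at any earlier date equals the share who have received at least one dose by then. Measuring dates as a day index, the function name, and the data are all assumptions made for illustration.

```python
# Sentinel for respondents still unvaccinated at endline, as described in the text.
UNVACCINATED = 1000

def share_vaccinated_by(dose_days, t):
    """Empirical CDF at t: share of respondents whose first-dose day is at or before t."""
    return sum(1 for d in dose_days if d <= t) / len(dose_days)

# Hypothetical first-dose days (day index within the study period) for one intention group.
dose_days = [10, 20, 35, UNVACCINATED, UNVACCINATED]
for t in (15, 30, 60):
    print(f"day {t}: share vaccinated = {share_vaccinated_by(dose_days, t):.2f}")
```

Because the sentinel never falls at or below a real date, the curve levels off at the share ever vaccinated in the group instead of rising to one.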